This document was created from an R Markdown file. The repository for the project can be found here. The data reported in the paper can be explored interactively at the Metalab website.
1 Reported model results
The tables below show the estimates for the single-moderator models reported in the main text. For categorical variables, the level reported in the table is the first level listed in the parentheses. Among the single-predictor models, predicate type is a significant moderator: hearing transitive sentences is associated with larger effect sizes. We also found that median vocabulary size is a marginally significant moderator.
1.1 Mean age (Days??)
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.62 [0.1, 1.13] | 2.34 | 0.02* |
| Mean Age | <.001 | -1.64 | 0.1 |
1.2 Median productive vocabulary size
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.59 [0, 1.17] | 1.97 | 0.05* |
| Median productive vocabulary size | -0.01 [-0.02, <.001] | -1.93 | 0.05 |
1.3 Predicate Type
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.08 [-0.17, 0.33] | 0.65 | 0.52 |
| Predicate type (Transitive / Intransitive) | 0.24 [0.02, 0.46] | 2.13 | 0.03* |
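As a rough consistency check on how the columns in these tables relate, the sketch below recovers an approximate standard error, z value, and two-sided p value from a reported estimate and its 95% confidence interval. It assumes a Wald-type z test with a normal approximation, which matches the "z value" column but is not confirmed by the text; the function name `wald_from_ci` is hypothetical. Because the inputs are rounded to two decimals, the output only approximates the tabled values.

```python
from statistics import NormalDist


def wald_from_ci(estimate, ci_lower, ci_upper, level=0.95):
    """Approximate SE, z, and two-sided p from an estimate and its CI.

    Assumes a symmetric Wald interval based on the standard normal
    distribution (an assumption, not something the source confirms).
    """
    std_normal = NormalDist()
    # Critical value, about 1.96 for a 95% interval
    crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    se = (ci_upper - ci_lower) / (2 * crit)
    z = estimate / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return se, z, p


# Predicate type row from section 1.3: 0.24 [0.02, 0.46]
se, z, p = wald_from_ci(0.24, 0.02, 0.46)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Applied to the predicate type row, this gives values close to the reported z value and p value; small discrepancies are expected from rounding in the table.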
1.4 Noun phrase type
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.17 [-0.09, 0.43] | 1.25 | 0.21 |
| Noun phrase type (Pronoun / Noun) | 0.14 [-0.26, 0.53] | 0.69 | 0.49 |
1.5 Character identification phase
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.16 [-0.09, 0.42] | 1.25 | 0.21 |
| Character identification phase (Yes / No) | 0.2 [-0.27, 0.67] | 0.84 | 0.4 |
1.6 Practice phase
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.35 [0.08, 0.62] | 2.55 | 0.01* |
| Practice phase (Yes / No) | -0.23 [-0.53, 0.06] | -1.54 | 0.12 |
1.7 Synchronicity
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.17 [-0.08, 0.42] | 1.31 | 0.19 |
| Synchronicity (Simultaneous / Asynchronous) | 0.13 [-0.18, 0.43] | 0.83 | 0.41 |
1.8 Testing structure
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.13 [-0.1, 0.36] | 1.1 | 0.27 |
| Testing Procedure Structure (Mass / Distributed) | 0.38 [-0.09, 0.85] | 1.6 | 0.11 |
1.9 Number of sentence repetitions
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.17 [-0.14, 0.47] | 1.08 | 0.28 |
| Number of sentence repetitions | 0.01 [-0.02, 0.03] | 0.51 | 0.61 |
2 Reported model results using dataset without imputed values
As mentioned in the Methods section, for studies missing relevant statistics, we imputed values from studies with a similar design (e.g., Hirsh-Pasek, Golinkoff, & Naigles, 1996). The tables below report the results of fitting the exact same models to the dataset without the imputed values. There was no significant difference between the outcomes from the two datasets.
2.1 Mean age (Days??)
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.6 [0.05, 1.16] | 2.15 | 0.03* |
| Mean Age | -0.001 [<.001, <.001] | -1.57 | 0.12 |
2.2 Median productive vocabulary size
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.59 [0, 1.17] | 1.97 | 0.05* |
| Median productive vocabulary size | -0.01 [-0.02, <.001] | -1.93 | 0.05 |
2.3 Predicate Type
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.06 [-0.22, 0.34] | 0.43 | 0.67 |
| Predicate type (Transitive / Intransitive) | 0.26 [0.02, 0.49] | 2.15 | 0.03* |
2.4 Noun phrase type
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.15 [-0.13, 0.44] | 1.05 | 0.29 |
| Noun phrase type (Pronoun / Noun) | 0.14 [-0.3, 0.58] | 0.63 | 0.53 |
2.5 Character identification phase
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.13 [-0.17, 0.42] | 0.84 | 0.4 |
| Character identification phase (Yes / No) | 0.24 [-0.27, 0.74] | 0.92 | 0.36 |
2.6 Practice phase
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.34 [0.05, 0.64] | 2.26 | 0.02* |
| Practice phase (Yes / No) | -0.25 [-0.57, 0.07] | -1.56 | 0.12 |
2.7 Synchronicity
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.15 [-0.13, 0.42] | 1.05 | 0.3 |
| Synchronicity (Simultaneous / Asynchronous) | 0.13 [-0.19, 0.46] | 0.81 | 0.42 |
2.8 Testing structure
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.09 [-0.17, 0.35] | 0.68 | 0.5 |
| Testing Procedure Structure (Mass / Distributed) | 0.42 [-0.07, 0.91] | 1.69 | 0.09 |
2.9 Number of sentence repetitions
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.14 [-0.19, 0.47] | 0.83 | 0.4 |
| Number of sentence repetitions | 0.01 [-0.02, 0.03] | 0.54 | 0.59 |
3 Additional Models: Methodological Moderators with Theoretical Moderators
Syntactic bootstrapping studies differ in their implementation details. Here we examine to what extent the influences of the theoretical moderators can be accounted for by the methodological factors. The tables below present the results of models that include all the key methodological moderators and one of the theoretical moderators. The patterns were consistent with the single-predictor theoretical models: predicate type remained a significant predictor of effect size.
The tables here present models with exploratory moderators. For categorical variables, the level reported in the table is the first one listed in the parentheses.
4.1 Patient argument type for transitive sentence
In the main analysis, we presented the results of the model for the relationship between effect size and the agent argument type. We found that having nouns or pronouns in the agent argument does not significantly predict the effect size. Here, we present a similar analysis of the influence of the patient argument type. Because English intransitive sentences by definition do not have a patient argument, we focus on the subset of studies that used transitive sentences (\(N\) = 30).
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.28 [0.01, 0.54] | 2.06 | 0.04* |
| Patient Argument Type (Pronoun / Noun) | -0.05 [-0.48, 0.38] | -0.22 | 0.83 |
4.2 Stimuli Modality
We found that the presentation modality of the stimuli was not a significant predictor of the effect size. In other words, studies that presented young children with animation clips had effect sizes similar to those of studies using video clips. The model statistics are shown below. Note that many of the same studies fall into corresponding levels of stimuli modality and stimuli actor, so this result should be interpreted with caution.
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.58 [0.11, 1.05] | 2.4 | 0.02* |
| Stimuli Modality (Video / Animation) | -0.38 [-0.83, 0.07] | -1.65 | 0.1 |
4.3 Stimuli actors
There is a marginal effect of stimuli actor. Studies with human actors as protagonists in the events had relatively smaller effect sizes than studies using puppets, human actors in animal suits, or animated geometrical shapes. This might be due to the relatively higher visual complexity of stimuli with real human actors.
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.42 [0.12, 0.71] | 2.79 | 0.01* |
| Stimuli Actor (Person / Non-person) | -0.31 [-0.64, 0.02] | -1.85 | 0.06 |
4.4 Type of event
Studies differed in the types of transitive and intransitive events they presented. Previous studies have shown that young children's looking behaviors in the Intermodal Preferential Looking Paradigm are very sensitive to subtle perceptual differences in the visual stimuli (Delle Luche, Durrant, Poltrock, & Floccia, 2015; Fernald, Zangl, Portillo, & Marchman, 2008). Therefore, we coded the types of events presented in the visual stimuli. There were two types of transitive events: direct causal action and indirect causal action. The former involved the agent directly acting on the patient and causing the patient to move. The latter involved a means-end sequence leading to the caused action of the patient. For example, the agent may pull a band on the patient's waist and cause it to move. There were also two types of intransitive events used in the literature. One involved a single actor acting, such as jumping up and down. The other involved two actors presented without any causal action.
Our models suggested that neither variable was predictive of the effect sizes.
4.4.1 Transitive Event type
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.22 [-0.01, 0.44] | 1.89 | 0.06 |
| Transitive Event Type (Indirect caused action / Direct caused action) | 0.03 [-0.34, 0.4] | 0.16 | 0.87 |
4.4.2 Intransitive event type
| Parameter | Estimate | z value | p value |
|---|---|---|---|
| Intercept | 0.25 [-0.11, 0.62] | 1.36 | 0.17 |
| Intransitive Event Type (Parallel actions / One action) | -0.04 [-0.38, 0.31] | -0.2 | 0.84 |
5 Details of calculating effect size
To standardize the effect size calculation, we converted some reported raw results to the proportion of correct responses. For looking-time studies, when the paper reported only the raw looking time in seconds, we calculated the proportion of correct responses by dividing the mean looking time toward the matching scene by the sum of the looking times toward the matching and non-matching scenes (i.e., excluding look-away time from the denominator). The raw standard deviations were converted to the same scale by dividing them by the same sum.
Below is a step-by-step example calculation using data from Yuan & Fisher (2009), Experiment 1.
Raw data from Yuan & Fisher (2009, p. 622), Table 1. Values are mean looking times in seconds, with standard errors (SE) in parentheses.
| Dialogue Type | Sample Size | Two-participant Event | One-participant Event |
|---|---|---|---|
| Transitive | 8 | 4.82 (0.43) | 2.87 (0.51) |
| Intransitive | 8 | 3.33 (0.24) | 4.12 (0.40) |
When the paper provided only raw looking-time data, we converted the data into the proportion of correct looking time and the corresponding variances following the formulae below.
Data converted from raw looking time to the proportion of looking time to the correct scene. For children hearing transitive sentences, the correct scene was the Two-participant Event. For children hearing intransitive sentences, the correct scene was the One-participant Event. Standard deviations were calculated by first scaling the raw SE by the summed looking time and then multiplying by the square root of the number of participants.
| Dialogue Type | Sample Size | Mean Proportion | Standard Deviation |
|---|---|---|---|
| Transitive | 8 | 0.627 | 0.158 |
| Intransitive | 8 | 0.553 | 0.152 |
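The conversion described above can be sketched in a few lines of Python. This is an illustration rather than the authors' script: the function name `to_proportion` is hypothetical, and the sketch assumes that the SE being scaled is the one reported for the correct scene, a reading that reproduces the converted values for Yuan & Fisher (2009), Experiment 1.

```python
import math


def to_proportion(correct_mean, other_mean, correct_se, n):
    """Convert raw looking times (seconds) to a proportion and its SD.

    correct_mean: mean looking time to the matching (correct) scene
    other_mean:   mean looking time to the non-matching scene
    correct_se:   standard error of correct_mean (assumed to be the SE scaled)
    n:            number of participants in the condition
    """
    total = correct_mean + other_mean  # look-away time is excluded
    proportion = correct_mean / total
    # Scale the SE by the same sum, then convert SE to SD: SD = SE * sqrt(n)
    sd = (correct_se / total) * math.sqrt(n)
    return proportion, sd


# Transitive dialogue: the Two-participant Event is the correct scene
transitive = to_proportion(4.82, 2.87, 0.43, 8)
# Intransitive dialogue: the One-participant Event is the correct scene
intransitive = to_proportion(4.12, 3.33, 0.40, 8)

print(f"Transitive:   {transitive[0]:.3f} (SD {transitive[1]:.3f})")
print(f"Intransitive: {intransitive[0]:.3f} (SD {intransitive[1]:.3f})")
```

Rounded to three decimals, the results agree with the converted table: 0.627 (SD 0.158) for the transitive condition and 0.553 (SD 0.152) for the intransitive condition.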
Then we calculate Cohen's d and its variance as follows (the implementation of the script can be found at XXX).
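The exact formulae are not reproduced in this text, so the following is only a sketch under stated assumptions, not the authors' implementation. It assumes a one-sample design in which the mean proportion is compared against a chance level of 0.5, with d = (M - chance) / SD and a commonly used variance approximation, 1/n + d²/(2n). The function name `cohens_d_vs_chance`, the chance level, and the variance formula are all assumptions.

```python
def cohens_d_vs_chance(mean_prop, sd, n, chance=0.5):
    """One-sample Cohen's d against a chance level, with approximate variance.

    Assumed formulas (not confirmed by the source):
        d      = (mean_prop - chance) / sd
        var(d) = 1/n + d^2 / (2n)
    """
    d = (mean_prop - chance) / sd
    d_var = 1 / n + d ** 2 / (2 * n)
    return d, d_var


# Converted values for Yuan & Fisher (2009), Experiment 1, as tabled above
d_transitive, var_transitive = cohens_d_vs_chance(0.627, 0.158, 8)
d_intransitive, var_intransitive = cohens_d_vs_chance(0.553, 0.152, 8)

print(f"Transitive:   d = {d_transitive:.3f}, var = {var_transitive:.3f}")
print(f"Intransitive: d = {d_intransitive:.3f}, var = {var_intransitive:.3f}")
```

If the original script uses a different comparison (for example, a between-condition contrast) or a small-sample correction, the resulting values will differ from this sketch.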
The plot below shows a modified funnel plot, or "significance funnel", in which significant studies are shown in orange and non-significant studies are shown in grey. The x-axis shows effect size estimates, and the y-axis shows the estimated standard error for each estimate. Studies lying on the grey line have a p-value of .05. The black diamond shows the meta-analytic effect size estimate for all studies; the grey diamond shows the meta-analytic effect size estimate for non-significant studies only (the "worst-case" publication scenario). Note that the worst-case scenario appreciably attenuates the effect size estimate but does not attenuate the point estimate to 0 (worst-case estimate: 0.08 [-0.1, 0.26]).
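To make the role of the grey line concrete, the sketch below classifies studies as significant or non-significant from their effect size estimate and standard error. It assumes a two-sided z test with a normal approximation, under which the p = .05 boundary lies at roughly |estimate| = 1.96 × SE; the plotting tool used for the figure may define the boundary differently. The function names and the example studies are hypothetical.

```python
from statistics import NormalDist


def two_sided_p(estimate, se):
    """Two-sided p value for a z test of estimate / se (normal approximation)."""
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(estimate, se, alpha=0.05):
    """True if the study falls on the significant side of the p = alpha line."""
    return two_sided_p(estimate, se) < alpha


# Hypothetical (estimate, standard error) pairs, for illustration only
example_studies = [(0.50, 0.20), (0.20, 0.20), (0.08, 0.09)]
labels = [
    "significant" if is_significant(est, se) else "non-significant"
    for est, se in example_studies
]
print(labels)
```

In a significance funnel, points for which `is_significant` returns True would be drawn in the "significant" color, and the boundary itself corresponds to the line on which the p-value equals .05.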
7 Heat Map
The heatmaps below show the overlap between moderators. Each cell corresponds to the co-occurrence of two moderator levels. Brighter colors indicate a higher frequency of co-occurrence, and darker colors indicate a lower frequency. You can hover your mouse over the heatmap to see the value and the combination of levels for each cell.
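The cell values of such a heatmap are co-occurrence counts. A minimal sketch of how they could be tallied is shown below, assuming each study is coded as a mapping from moderator to level. The records, the moderator names, and the function `cooccurrence_counts` are hypothetical and do not come from the dataset.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded studies, for illustration only
example_records = [
    {"predicate_type": "transitive", "stimuli_actor": "person", "stimuli_modality": "video"},
    {"predicate_type": "transitive", "stimuli_actor": "person", "stimuli_modality": "video"},
    {"predicate_type": "intransitive", "stimuli_actor": "non-person", "stimuli_modality": "animation"},
]


def cooccurrence_counts(records):
    """Count how often each pair of moderator levels appears in the same record."""
    counts = Counter()
    for record in records:
        # Label each level as "moderator=level" and sort for a stable pair order
        levels = sorted(f"{name}={level}" for name, level in record.items())
        for first, second in combinations(levels, 2):
            counts[(first, second)] += 1
    return counts


counts = cooccurrence_counts(example_records)
for pair, count in sorted(counts.items()):
    print(pair, count)
```

Each key is an unordered pair of levels stored in sorted order, so a symmetric heatmap can be filled by mirroring the counts across the diagonal.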
7.1 Ordered by Row Average
7.2 Ordered by groups
8 Variability in visual stimuli as a function of age
There was some evidence that researchers adapted the level of visual complexity in the visual stimuli to children's age. We collected the available visual stimuli from the papers and the supporting materials. Schematic illustrations of the visual stimuli were used when actual screenshots were not provided. Screenshots of the text descriptions of the events were used when the visual stimuli were unavailable. Note that because some publishers converted the visual stimuli to black and white, we converted all visual stimuli to grayscale for easier visual comparison.
The plot suggests that studies with particularly young children used noticeably simpler visual stimuli. This adaptation might be partly responsible for the lack of an age effect observed in our sample.
References
Delle Luche, C., Durrant, S., Poltrock, S., & Floccia, C. (2015). A methodological investigation of the Intermodal Preferential Looking paradigm: Methods of analyses, picture selection and data rejection criteria. Infant Behavior and Development, 40, 151-172.
Fernald, A., Zangl, R., Portillo, A. L., & Marchman, V. A. (2008). Looking while listening: Using eye movements to monitor spoken language. Developmental psycholinguistics: On-line methods in children’s language processing, 44, 97.